Abstract

This  paper  considers  the  problem  of  existential  quantifier  elimination 
for  real  algebra  (QE).  It  introduces  an  algorithmic  framework  for  exploring 
reformulations  of  QE  problems,  with  the  goal  of  finding  reformulations 
that  make  difficult  problems  tractable  for  QE  implementations,  or  for 
which  these  implementations  find  simpler  solutions.  The  program  qfr  is 
introduced,  which  implements  this  approach,  and  its  performance  on  some 
example  problems  is  reported. 


1  Introduction 


“Why  do  it  like  a  human  when  you  can  do  it  right.”  —  Joel  Moses 

The  above  quote  is  indicative  of  a  philosophy  underlying  computer  algebra. 
We  don’t  emulate  human  methods  of  solution  in  our  algorithms  for  problems 
like  integration,  summation,  factorization,  solutions  of  systems  of  polynomial 
equations  and  a  host  of  other  problems.  In  fact,  humans  aren’t  very  good  at 
these  problems,  while  computer  algebra  programs  have  proven  to  be  remarkably 
effective.  Why  approach  a  mathematical  problem  like  a  person?  Why  not  do  it 
right? 

This paper considers a very fundamental problem: quantifier elimination for
the first order theory of real algebra. Roughly speaking, this means determining
the satisfiability of systems of polynomial equalities and inequalities over the
reals, where the polynomials involved may have coefficients that are expressions
in some parameters. When coefficients are rational functions of the parameters,
this problem is solvable, and there has been a lot of work over many years on
algorithms and software for doing it. The general approaches taken (Tarski’s
original approach [11], Collins’ cylindrical algebraic decomposition [5], and the
“root counting” approach of Weispfenning [15] and of Heintz, Roy and Salerno
[9]) are typical of computer algebra in that they do not emulate the ways in
which people solve such problems.


The  thesis  of  the  paper  is  that  this  subject  would  benefit  from  a  bit  of  trying 
to “do it like a human” rather than just “doing it right”. While it is true that
humans  are  not  good  at  solving  these  problems,  neither  are  machines.  The 
fundamental  problem  is  computationally  intractable  in  any  traditional  sense. 
What  humans  can  do,  however,  that  these  algorithms  cannot,  is  flexibly  exploit 
problem-specific  structure  —  the  kind  of  structure  that  real-world  problems 
have.  The  following  problem,  which  comes  from  epidemiological  modeling  [12], 
is  such  an  example. 


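$$
\begin{aligned}
\exists S, E, I, T\; [\; & d - dS - \beta_1 I S = 0 \;\land\; vE - (d + r_2) I = 0 \;\land \\
& \beta_1 I S + \beta_2 I T - (d + v + r_1) E + (1 - q) r_2 I = 0 \;\land \\
& -dT + r_1 E + q r_2 I - \beta_2 T I = 0 \;\land\; E > 0 \land I > 0 \land T > 0 \land S > 0 \;]
\end{aligned}
$$

where $d > 0 \land v > 0 \land r_1 > 0 \land r_2 > 0 \land q > 0 \land \beta_1 > \beta_2 > 0$.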

None of the software systems we have tried on this problem¹, as formulated,
can solve it, at least not in the amount of time we were willing to wait.
However, in the cited paper, the authors solve it by hand. Several of the
systems we tried can be coaxed into solving the problem if we first do what any
normal person would do: solve for linearly occurring variables and substitute.
Doing this, which is a bit tedious, we can eliminate all but one of the variables.
Software can take care of the last one.
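As a sketch of the first substitution steps (our own illustration, not reproduced from [12]): the first equation is linear in $S$ and the second is linear in $E$, and the parameter assumptions together with $I > 0$ make the relevant coefficients positive, so no case split is needed.

$$
\begin{aligned}
d - dS - \beta_1 I S = 0 \;&\Longrightarrow\; S = \frac{d}{d + \beta_1 I} \qquad (d + \beta_1 I > 0), \\
vE - (d + r_2) I = 0 \;&\Longrightarrow\; E = \frac{(d + r_2)\, I}{v} \qquad (v > 0).
\end{aligned}
$$

Substituting these into the remaining equations and inequalities leaves a problem in $I$ and $T$ only. Both remaining equations are linear in $T$, with coefficients $\beta_2 I$ and $-(d + \beta_2 I)$ that are nonzero under the same assumptions, so $T$ can be eliminated in the same way, leaving $I$ as the last quantified variable.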


Actually, there is another sense in which the way people solve problems like
the above is better than that of our algorithms: people produce simple
answers. In solving such problems we seek out steps that keep expressions simple,
and we systematically exploit the problem’s structure to rule out special cases,
which has a lot to do with our ability to solve these problems while automated
procedures fail.

This paper’s major goal is to provide an efficient algorithm/data-structure
foundation for attacking these problems in a human-like manner inside a program.

It is important to note that we are not suggesting that quantifier elimination
problems be solved solely by mimicking people, rather that programs should try
to exploit problem-specific structure like people do. Typically, this will generate
one or more simpler quantifier elimination problems that would then be solved
by regular quantifier elimination programs. It is also important to note that this
approach has nothing to offer for many kinds of QE problems. Higher degree
problems, problems without equations, very generic problems: problems like
these, which humans can’t make much progress with, will not benefit. However,
we will exhibit a variety of application problems from diverse sources that our
approach does apply to and on which it does succeed, in combination with other
programs, in giving good quantifier-free solutions in a reasonable amount of time.

¹ We tried: QEPCADB v1.45, Redlog v3.0, Mathematica v5.2 and the RS function of the
Salsa Maple packages from INRIA.


1.1  A  more  precise  formulation  of  the  problem 

The  kind  of  human  problem  solving  we  are  trying  to  emulate  basically  amounts 
to  rewriting  or  reformulating  existential  quantifier  elimination  (QE)  problems. 
Starting  with  an  initial  problem,  we  eliminate  variables  by  substitution  when 
they  occur  linearly  in  equations;  we  split  into  cases  depending  on,  for  example, 
whether a leading coefficient is zero or not; we remove redundancies like $0 < b$
in $a < 0 \land b < 0 \land a < b$; and much more in the same vein. We arrive at one or
more  simpler  (we  hope!)  problems  that  we  can’t  do  any  more  with,  and  these 
will  have  to  be  solved  by  other  methods.  Thus,  the  fundamental  problem  is 
this:  Given  an  existential  QE  problem  P  and  a  set  of  rewriting  operators,  find 
the  “best”  reformulation  of  P  as  one  or  more  simpler  QE  problems. 

Ultimately,  one  would  be  interested  in  finding  the  best  rewriting  —  or  even 
just  a  good  rewriting  —  quickly.  For  this  paper,  however,  we  only  consider  the 
problem  of  generating  all  rewritings  and  finding  the  best.  Our  main  result  is 
an  algorithm  that  efficiently  searches  the  space  of  all  rewritings  of  a  given  QE 
problem  to  find  the  one  that  is  “best”  with  respect  to  a  suitable  metric.  This 
algorithm  provides  a  method  for  comparing  fast,  heuristic-based  algorithms  for 
rewriting  in  future  work,  and  a  foundation  on  which  such  algorithms  can  be 
based. 


2  The  “space  of  rewritings” 


One  of  the  basic  concepts  behind  the  AI  search  paradigm  is  that  of  the  state 
space.  An  initial  state  and  a  set  of  operators  that  map  states  to  potential 
successor  states  together  generate  a  state  space.  In  our  case,  states  are  formulas. 
The  initial  state  is  the  input  formula,  and  operators  produce  new  formulas  that 
are logically equivalent over the reals. One such operator, for example, rewrites
formulas of the form $ax + b = 0 \land F$ as $(a \neq 0 \land F|_{x = -b/a}) \lor (a = 0 \land b = 0 \land F)$.
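As a small worked instance of this linear-substitution operator (a hypothetical example of our own, taking $F$ to be $x > 1$):

$$
ax + b = 0 \land x > 1
\;\;\longrightarrow\;\;
\left(a \neq 0 \land -\tfrac{b}{a} > 1\right) \lor \left(a = 0 \land b = 0 \land x > 1\right).
$$

In the first disjunct $x$ no longer appears, and multiplying $-b/a > 1$ by $a^2 > 0$ clears the denominator to give $a \neq 0 \land a(a + b) < 0$. The second disjunct still contains $x$, which is why the operator introduces a disjunction rather than simply eliminating the variable.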

The basic idea is, starting with an input formula and a set of rewrite operators,
to explore the state space and return the rewriting that we deem to be “best”.
This differs from the usual AI search scenario, because there’s no goal test:
there’s no way to look at a single state in isolation and determine that it is “the
answer”.

There are many rewritings that are completely uninteresting. For example,
rewriting $f^5 > 0$ as $f^3 > 0$ isn’t interesting. In fact, we have no reason to ever
want to see either expression in any formula, since they are both equivalent to
$f > 0$. As another example, consider $F \lor (x^2 + 1 < 0 \land G)$. Any rewritings of $G$
are completely uninteresting, since the disjunct it is part of is equivalent to false
anyway. Rewriting $G$ is a waste of time and space. What these two examples
point to is the need to distinguish between what we call “normalization” and
the rewritings performed by operators. Normalization involves rewritings we
will always want to carry out: e.g. $f^5 > 0 \rightarrow f > 0$, or $x^2 + 1 < 0 \rightarrow$ false.
Rewrite operators carry out rewritings that are not always desirable.

Another issue is the need to recognize states that are the same as states that have
already been processed. In different search contexts, “the same” might mean
different things. In our context, all states are logically equivalent over the reals,
so defining “the same” too deeply is counterproductive. We take “the same” to
mean syntactically identical after normalization.
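A minimal sketch of this idea in Python, under strong simplifying assumptions: atoms are sign conditions on a power of a single named factor, written as hypothetical triples `(name, power, op)`, and a conjunction is normalized to a hashable set so that "the same" reduces to equality. This is only an illustration of normalize-then-compare, not the normalizer used by any actual system.

```python
from typing import FrozenSet, Tuple, Union

# An atom (name, power, op) stands for  name**power  op  0, with power >= 1.
Atom = Tuple[str, int, str]
FALSE = "false"
TRUE = "true"


def normalize_atom(atom: Atom) -> Union[Atom, str]:
    """Normalize a sign condition on a power of a single factor."""
    name, power, op = atom
    if power % 2 == 1:
        # Odd powers preserve sign: f**5 > 0 is equivalent to f > 0.
        return (name, 1, op)
    # Even powers are never negative.
    even_rules = {
        "<": FALSE,
        ">=": TRUE,
        ">": (name, 1, "!="),
        "<=": (name, 1, "="),
        "=": (name, 1, "="),
        "!=": (name, 1, "!="),
    }
    return even_rules[op]


def normalize_conjunction(atoms) -> Union[FrozenSet[Atom], str]:
    """Normalize a conjunction; the result is hashable and order-independent."""
    result = set()
    for atom in atoms:
        norm = normalize_atom(atom)
        if norm == FALSE:
            return FALSE          # one false conjunct makes the conjunction false
        if norm != TRUE:
            result.add(norm)      # drop conjuncts that are trivially true
    return frozenset(result) if result else TRUE


a = normalize_conjunction([("f", 5, ">"), ("g", 1, "<")])
b = normalize_conjunction([("g", 1, "<"), ("f", 3, ">")])
same = (a == b)   # both normalize to {f > 0, g < 0}
```

Because the normalized form is hashable, it can be stored directly in a set of already-generated states, which is all the duplicate test needs.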

The straightforward approach to generating the space of rewritings then is given
in Algorithm 1.


Algorithm 1 Naive exploration of the space of rewritings.

1: enqueue F in Q
2: while Q not empty do
3:     G := dequeue Q
4:     for each rewriting operator application op do
5:         G' := result of applying op to G
6:         G'' := normalization of G'
7:         enqueue G'' in Q (unless G'' has already been generated)


If no operator introduces disjunctions, this kind of search might be reasonable.
However, when disjunctions can be introduced (and this is the case with most
interesting operators), this naive search suffers from two debilitating sources of
inefficiency. Suppose $F_1 \lor F_2 \lor \cdots \lor F_k$ is a formula encountered during
the search.


1. If $\mathrm{op}_1, \ldots, \mathrm{op}_k$ are operators such that $\mathrm{op}_i$ acts on disjunct $F_i$, there are $2^k$
distinct formulas generated by different orders of applying the operators
(barring coincidental duplications). In other words, even though there is
one “destination”, namely $\mathrm{op}_1(F_1) \lor \mathrm{op}_2(F_2) \lor \cdots \lor \mathrm{op}_k(F_k)$, there are $2^k$
steps if you traverse all possible paths getting there, because of all the
different orders one can use to apply operators.

2. If, for each $i$, there are $s_i$ operators that can be applied to $F_i$, then even
forgetting about the orders in which operators are applied to disjuncts,
there are $(1 + s_1)(1 + s_2) \cdots (1 + s_k)$ ways that a new formula can be
generated by applying no more than one operator to each disjunct.


Of course, these two factors combine to produce an enormous number of possible
rewritings. Moreover, in neither case have we considered applying a rewrite
operator to the result of a rewrite operator, e.g. “substitute $x = -b/c$ into $F$
then substitute $y = -d/e$ in the result”.
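The first source of inefficiency can be made concrete with a small Python sketch of an Algorithm 1 style search. The state encoding is a toy assumption of ours: a state is a tuple of $k$ flags, where flag $i$ records whether $\mathrm{op}_i$ has been applied to disjunct $F_i$; real states would be normalized formulas.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Tuple


def naive_search(start: Hashable,
                 successors: Callable[[Hashable], Iterable[Hashable]]) -> Tuple[set, int]:
    """Breadth-first exploration in the style of Algorithm 1.

    Returns the set of distinct states and the number of times a state
    was generated again after it had already been seen.
    """
    seen = {start}
    queue = deque([start])
    duplicates = 0
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            if nxt in seen:
                duplicates += 1      # work wasted rediscovering a known formula
            else:
                seen.add(nxt)
                queue.append(nxt)
    return seen, duplicates


def toy_successors(state: Tuple[bool, ...]):
    """Toy model: state[i] records whether op_i has been applied to F_i."""
    for i, applied in enumerate(state):
        if not applied:
            yield state[:i] + (True,) + state[i + 1:]


k = 4
states, duplicates = naive_search((False,) * k, toy_successors)
```

With $k = 4$ the search visits $2^4 = 16$ distinct states even though only one of them, the state with every operator applied, is the "destination", and more than half of the successor computations rediscover states that are already known.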

As described in [3], Christian Gross implemented a heuristically guided version
of the above algorithm, which prunes away parts of the search tree based on a
function for “grading” formulas. This program is able to find good rewritings
for several interesting inputs, but only when they are relatively easy to find,
since the space it has to search is so vast. In fact, it discovers very few distinct
formulas, spending most of its time discovering duplicates. Moreover, the
only operators it includes are factor splitting for equations and linear substitution.
Presumably, incorporating more operators would exacerbate the problems
associated with search space size.

Gross’s implementation is intended to serve as a “preprocessor” for quantifier
elimination programs; primarily for QEPCADB, but also for Redlog or
Mathematica. For several problems from different domains, it is able to find an
input rewriting that makes an intractable problem tractable for these systems,
or improves the quality of the answer. Both the successes of this program and its
shortcomings motivate the search for a better approach to exploring the space
of rewritings.


3  Searching  more  efficiently 


The first idea for improving the search through the space of rewritings is to
decouple disjuncts in a formula. The two major factors cited in the previous
section can both be avoided by searching for rewritings of disjuncts independently.
Given formula $F \lor G$, we generate the set $S_F$ of rewritings of $F$ and $S_G$
of rewritings of $G$. These two sets represent the $|S_F| \cdot |S_G|$ rewritings obtained
by forming the disjunction of an element of $S_F$ and an element of $S_G$. Of course,
this process is recursive, meaning that in searching for rewritings of $F$ or $G$ we
might form new disjunctions, and we should treat the search for rewritings of
each disjunct as independent problems.
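The saving from decoupling can be illustrated with a few lines of Python. The formula names below are placeholders of ours; the point is only that the two sets are stored separately while their product is represented implicitly.

```python
from itertools import product

# Hypothetical rewritings of two disjuncts, found by independent searches.
S_F = ["F1", "F2", "F3"]
S_G = ["G1", "G2"]

# Only len(S_F) + len(S_G) formulas are stored ...
stored = len(S_F) + len(S_G)

# ... yet they implicitly represent every pairwise disjunction.
represented = [f"{f} or {g}" for f, g in product(S_F, S_G)]
```

Here five stored formulas stand for six rewritings of the disjunction; the gap widens quickly as the number of disjuncts and the size of each set grow.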

The second idea for improving the search is that since we expect to generate
the same disjuncts in different ways, and since we expect to discover that some
disjuncts are unsatisfiable, we should be able to exploit such information.

Figure 1: A simple rewrite graph representing the following formulas as equivalent:
$F_1$, $F_2$, $G_1 \lor H_1$, $G_1 \lor H_2$, $G_2 \lor H_1$, $G_2 \lor H_2$.


In the next section we describe a data structure, the “rewrite graph”, that
allows us to implement the above ideas and thus to search much more
effectively. Nodes in this graph are either Q-nodes, which represent a set of
equivalent formulas, or OR-nodes, which represent disjunctions. The simple
graph in Figure 1 represents the following formulas, all of which are equivalent:
$F_1$, $F_2$, $G_1 \lor H_1$, $G_1 \lor H_2$, $G_2 \lor H_1$, $G_2 \lor H_2$. We also show how this data
structure can be reorganized when a formula in a node is found to be false,
or when two nodes are found to contain the same subformula. We will then
reformulate our search as the process of choosing an unprocessed formula from
some Q-node in the rewrite graph, applying operators to it and modifying the
graph accordingly. An implementation of this approach is described later in the
paper, along with some empirical results on its performance.


4  The  primary  data  structure 


If $S$ is a set of equivalent formulas then, abusing notation a bit, we will allow $S$
to appear as a formula in expressions, with the same meaning as using any one
of its elements in its place.

Let $F$ be an existentially quantified formula with free variables $x_1, \ldots, x_k$ and
bound variables $x_{k+1}, \ldots, x_n$. We describe a data structure called a rewrite
graph (RG).


Definition 1 A Rewrite Graph $(V_Q, V_{OR}, E, S)$ is a directed acyclic, bipartite
graph $(V_Q, V_{OR}, E)$ with function $S$ that maps elements of $V_Q$ to sets of Tarski
formulas, such that:


• For “Q-node” $v \in V_Q$, all elements of $S(v)$ (which we sometimes write
$S_v$) are mutually equivalent.

• If “OR-node” $w \in V_{OR}$ is a child of Q-node $v$, and Q-nodes $u_1, \ldots, u_a$ are
the children of $w$, then $S(v) \Leftrightarrow \bigvee_{i=1}^{a} S(u_i)$.

Rewrite  graphs  give  us  a  compact  way  to  represent  a  very  large  number  of 
rewritings  of  a  formula.  The  following  definition  and  theorem  make  that  precise. 


Definition 2 Given rewrite graph $G = (V_Q, V_{OR}, E, S)$, define the function $H$
mapping vertices into sets of Tarski formulas as follows:

1. If $v$ is a Q-node with children $w_1, \ldots, w_a$,

$$H(v) = S(v) \cup H(w_1) \cup H(w_2) \cup \cdots \cup H(w_a).$$

2. If $w$ is an OR-node with Q-node children $u_1, \ldots, u_a$,

$$H(w) = \left\{ \textstyle\bigvee_{i=1}^{a} f_i \;\middle|\; (f_1, \ldots, f_a) \in H(u_1) \times \cdots \times H(u_a) \right\}.$$


Theorem 1 Given rewrite graph $G = (V_Q, V_{OR}, E, S)$ and node $v \in V_Q \cup V_{OR}$, all
formulas in $H(v)$ are mutually equivalent.


Proof. This is an obvious consequence of the definition of rewrite graphs. □


The general idea is that we start with some formula $F$ and construct a rewrite
graph $G$, rooted at node $r$, such that $F \in S(r)$. Then $H(r)$ gives the set of
rewritings of $F$.
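The two definitions can be sketched in Python as follows. The class and field names are our own, formulas are plain strings, and the example graph has the shape suggested by the caption of Figure 1 (an assumption, since the figure itself is not reproduced here): a root Q-node holding $F_1, F_2$ with one OR-node child whose two Q-node children hold $G_1, G_2$ and $H_1, H_2$.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Set


@dataclass
class QNode:
    formulas: Set[str]                                   # S(v)
    children: List["ORNode"] = field(default_factory=list)


@dataclass
class ORNode:
    children: List[QNode]                                # the Q-node disjuncts


def H(node) -> Set[str]:
    """All formulas represented at a node, following Definition 2."""
    if isinstance(node, QNode):
        result = set(node.formulas)
        for w in node.children:
            result |= H(w)
        return result
    # OR-node: one disjunction per choice of a representative from each child.
    choices = product(*(sorted(H(u)) for u in node.children))
    return {" or ".join(choice) for choice in choices}


g = QNode({"G1", "G2"})
h = QNode({"H1", "H2"})
root = QNode({"F1", "F2"}, [ORNode([g, h])])
rewritings = H(root)
```

Six stored formula names yield the six rewritings listed for Figure 1. In general $H$ can be exponentially larger than the graph, which is why a search would work on the graph and enumerate $H$ only when needed.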

We will allow ourselves a further abuse of notation by using RG-nodes as
formulas, e.g. $x_1 > 0 \land v$, where $v$ is an RG-node. In this context, $v$ has the same
meaning as any element of $H(v)$.

RG’s  play  a  key  role  in  our  algorithm  for  exploring  rewritings  of  an  existentially 
quantified  formula.  Not  only  do  they  provide  a  compact  representation  of  a 
large  number  of  possible  rewritings,  they  give  a  context  to  information  that 
gets  discovered  during  the  rewriting  process,  which  allows  us  to  exploit  such 
discoveries. 


Theorem 2 Let $G = (V_Q, V_{OR}, E, S)$ be a rewrite graph.


1. For any $i$, $1 \le i \le k$, and $\alpha \in \mathbb{R}$, $G|_{x_i = \alpha} = (V_Q, V_{OR}, E, S')$, where
$S'(v) = \{ f|_{x_i = \alpha} \mid f \in S(v) \}$, is also a rewrite graph. Essentially this
says that assigning a value to a free variable in a rewrite graph yields a
rewrite graph.

2. If for Q-nodes $u, v \in V_Q$ we have $S_u \cap S_v \neq \emptyset$, then $S_v \Leftrightarrow S_u$.

3. If for Q-node $v$ any element of $S_v$ is found to be equivalent to false, then
$v$ and all descendant nodes are equivalent to false.

4. If for Q-node $v$ any element of $S_v$ is found to be equivalent to true, then
all nodes along any path to $v$ are equivalent to true.

5. Let $v_1 \to w_1 \to v_2 \to \cdots \to w_s \to v_{s+1}$ be a path in $G$, where Q-nodes
$v_1$ and $v_{s+1}$ are known to be equivalent. Then $v_1, v_2, \ldots, v_s, v_{s+1}$
are all equivalent. (From which it trivially follows that $w_1, \ldots, w_s$ are all
equivalent to $v_1, \ldots, v_{s+1}$.)


Proof. The first four points are easily seen to be true, so we only explicitly
prove the last point. We will show that $v_1 \Leftrightarrow v_s$, and the conclusion follows by
induction. In fact, it suffices to show that $v_1 \Leftrightarrow w_s$, since by our hypotheses
$v_s \Leftrightarrow w_s$. So, let $\alpha$ be a point in free-variable space. If $v_1|_\alpha$ is true then $v_{s+1}|_\alpha$ is
true, by hypothesis, and that in turn means $w_s|_\alpha$ is true, since $v_{s+1}$ is a disjunct
of $w_s$. Conversely, if $w_s|_\alpha$ is true, then $v_s|_\alpha$ is true, which means that, by points
1 and 4, $v_1|_\alpha$ is true. □


4.1  Reduced rewrite graphs: exploiting what we learn about formulas


In  exploring  the  space  of  rewritings  of  a  formula,  two  events  can  occur  that 
provide  information  that  we  would  like  to  exploit:  we  may  discover  that  some 
formula  is  unsatisfiable,  or  we  may  discover  that  two  formulas  obtained  through 
different  rewritings  are  identical.  The  rewrite  graph  allows  us  to  exploit  this 
information. 


Definition 3 A Reduced Rewrite Graph (RRG) is a rewrite graph satisfying
some additional restrictions:

1. There is no vertex $v \in V_Q$ such that $S(v)$ contains true or false.

2. There are no two vertices $u, v \in V_Q$ such that $S(v) \cap S(u) \neq \emptyset$.

3. There are no two vertices in $V_{OR}$ with the same out-neighbor set.

4. Each vertex in $V_{OR}$ has a unique in-neighbor.

5. Each OR-node has at least two children.


The motivation for this definition is the goal of keeping the rewrite graph as
simple as possible while still describing all of the same “interesting” rewritings.


Rewritings with constants, known redundancies, or inconsistencies are not
“interesting”. Point 1 of the definition means we don’t allow constants, points 2 & 4
mean that we don’t allow distinct but obviously equivalent Q-nodes, and point
3 means we don’t allow distinct but obviously equivalent OR-nodes. Nicely
enough, Theorem 2 provides us with the tools we need to algorithmically
transform an arbitrary rewrite graph into a reduced rewrite graph. The principal
work in that process is done by Algorithm 2.
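The five conditions of Definition 3 are easy to check mechanically. The sketch below assumes a hypothetical dictionary-based encoding of the graph (our choice, not the paper's) and reports which conditions fail.

```python
from itertools import combinations
from typing import Dict, List, Set


def reduced_violations(S: Dict[str, Set[str]],
                       q_children: Dict[str, List[str]],
                       or_children: Dict[str, List[str]]) -> Set[int]:
    """Return the numbers of the conditions of Definition 3 that are violated.

    S           maps each Q-node name to its set of formulas,
    q_children  maps each Q-node name to the names of its OR-node children,
    or_children maps each OR-node name to the names of its Q-node children.
    """
    bad = set()
    # 1. No Q-node contains the constants true or false.
    if any(fs & {"true", "false"} for fs in S.values()):
        bad.add(1)
    # 2. No two Q-nodes share a formula.
    if any(S[u] & S[v] for u, v in combinations(S, 2)):
        bad.add(2)
    # 3. No two OR-nodes have the same out-neighbor set.
    child_sets = [frozenset(cs) for cs in or_children.values()]
    if len(child_sets) != len(set(child_sets)):
        bad.add(3)
    # 4. Each OR-node has a unique in-neighbor.
    for w in or_children:
        parents = [v for v, ws in q_children.items() if w in ws]
        if len(parents) != 1:
            bad.add(4)
    # 5. Each OR-node has at least two children.
    if any(len(set(cs)) < 2 for cs in or_children.values()):
        bad.add(5)
    return bad


ok = reduced_violations({"r": {"F"}, "a": {"G"}, "b": {"H"}},
                        {"r": ["w"]},
                        {"w": ["a", "b"]})
broken = reduced_violations({"r": {"F"}, "a": {"G", "false"}, "b": {"G"}},
                            {"r": ["w"]},
                            {"w": ["a"]})
```

A checker like this only detects violations; turning each violation into an assertion and repairing the graph is the job of the reorganization step.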

Algorithm 2 takes as input a rewrite graph $G = (V_Q, V_{OR}, E, S)$ and a queue
$Q$ of assertions of the form $(a, b)$, where $a, b \in V_Q \cup \{\mathit{false}\}$. Extending the
function $S$ by defining $S(\mathit{false}) = \{\mathit{false}\}$, assertion $(a, b)$ is the statement
$S(a) \Leftrightarrow S(b)$. The algorithm transforms $G$ by deleting any nodes asserted to be
false, and merging any nodes asserted to be equivalent. However, it accomplishes
this in such a way that $G$ remains a rewrite graph. Moreover, in this process
no rewritings are “lost”. More precisely, if a rewriting is represented by $G$,
the same rewriting exists in the RG produced by the algorithm, except that
some disjuncts may be removed if the assertions in $Q$ and/or the structure of
$G$ determine that they are redundant.
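The deletion half of this process (lines 6 to 13 of Algorithm 2) might be sketched as follows. The dictionary encoding of the graph and the function name are assumptions of ours, merging of equivalent nodes is not shown, and the final removal of unreachable nodes is left out.

```python
from collections import deque
from typing import Dict, List, Set

FALSE = "false"


def assert_false(start: str,
                 S: Dict[str, Set[str]],
                 q_children: Dict[str, List[str]],
                 or_children: Dict[str, List[str]]) -> Dict[str, str]:
    """Propagate the assertion 'start is false' through the graph (in place).

    S maps Q-nodes to formulas, q_children maps Q-nodes to their OR-node
    children, and or_children maps OR-nodes to their Q-node children.
    Returns a table mapping every deleted Q-node to FALSE.
    """
    alias: Dict[str, str] = {}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v in alias:
            continue                          # already known to be false
        alias[v] = FALSE
        for w in q_children.get(v, []):
            # w is equivalent to v, so every disjunct of w is false ...
            queue.extend(or_children.get(w, []))
            # ... and so is any other parent of w.
            queue.extend(p for p, ws in q_children.items() if p != v and w in ws)
        # Delete v: drop its formulas and remove it from every disjunction.
        S.pop(v, None)
        q_children.pop(v, None)
        for w, kids in or_children.items():
            or_children[w] = [c for c in kids if c != v]
    return alias


S = {"r": {"F"}, "a": {"A"}, "b": {"B"}, "c": {"C"}, "d": {"D"}}
q_children = {"r": ["w1"], "a": ["w2"]}
or_children = {"w1": ["a", "b"], "w2": ["c", "d"]}
alias = assert_false("a", S, q_children, or_children)
```

In this example asserting that `a` is false also removes its descendants `c` and `d`, and leaves OR-node `w1` with the single child `b`. That single-child OR-node and the orphaned `w2` are exactly the kind of leftovers a later reduction pass would clean up.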

One  complicating  issue  in  presenting  Algorithm  2  is  that  as  nodes  are  merged 
or  deleted,  the  node  names  in  Q  may  become  stale.  For  example,  if  (a,  b)  is 
dequeued,  where  a  and  b  are  nodes,  but  node  b  has  already  been  merged  with 
other  nodes,  what  do  we  do?  The  algorithm  deals  with  this  by  keeping  track  of 
“aliases”.  If  node  a  is  deleted  because  a  was  asserted  to  be  false,  the  name  a 
is  then  considered  to  be  an  alias  for  false.  If  node  b  is  merged  with  others  into 
a  new  node  c,  then  b  is  considered  to  be  an  alias  for  c.  When  a  possibly  stale 
name  a  is  encountered,  the  algorithm  simply  asks  for  the  true  name  of  a  before 
doing  anything.  The  functions  record  and  trueName  provide  this  bookkeeping. 
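The alias bookkeeping can be sketched in a few lines of Python. The names `record` and `true_name` follow the text; the dictionary-based storage is our assumption, and the sketch relies on merged nodes always receiving fresh names so that alias chains cannot form a cycle.

```python
from typing import Dict

FALSE = "false"
_alias: Dict[str, str] = {}


def record(old: str, new: str) -> None:
    """Record that the name `old` is now an alias for `new`."""
    _alias[old] = new


def true_name(name: str) -> str:
    """Follow alias links until reaching a name that is still current."""
    while name in _alias:
        name = _alias[name]
    return name


record("b", "c")      # b was merged into a new node c
record("c", "e")      # later, c itself was merged into e
record("a", FALSE)    # a was asserted false and deleted
```

A stale assertion such as `("a", "b")` is then resolved to `(true_name("a"), true_name("b"))` before anything else is done with it. A production version would probably compress paths as in a union-find structure, but the simple loop is enough to show the idea.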


Theorem 3 Let $G = (V_Q, V_{OR}, E, S)$ be a rewrite graph with root $r$, and
suppose $Q$ contains $(u_1, v_1), \ldots, (u_k, v_k)$. Let $G' = (V_Q', V_{OR}', E', S')$ be the graph
resulting from applying Algorithm 2 to $G$ and $Q$.

1. $G'$ is a rewrite graph under the assumption $\bigwedge (u_i \Leftrightarrow v_i)$.

2. For $1 \le i \le k$, trueName($u_i$) = trueName($v_i$) after Algorithm 2 finishes.

3. If $C_1 \lor \cdots \lor C_m \in H_G(r)$, then $H_{G'}(\mathrm{trueName}(r))$ contains some formula
$C_{i_1} \lor \cdots \lor C_{i_n}$, where $\{i_1, \ldots, i_n\} \subseteq \{1, \ldots, m\}$, such that

$$\bigwedge (u_i \Leftrightarrow v_i) \;\Rightarrow\; \left[ (C_1 \lor \cdots \lor C_m) \Leftrightarrow (C_{i_1} \lor \cdots \lor C_{i_n}) \right].$$


Algorithm 2 Reorganize $G = (V_Q, V_{OR}, E)$ based on $Q$, a queue of information
elements from $(V_Q \cup \{\mathit{false}\})^2$

 1: while Q not empty do
 2:     (u, v) := dequeue(Q)
 3:     u := trueName(u), v := trueName(v)
 4:     if u = v then continue
 5:     if v = false then swap(u, v)
 6:     if u = false then
 7:         record(v, false), i.e. record that v is an alias for false
 8:         for each child w of v do
 9:             for each child x of w do
10:                 enqueue((false, x), Q)
11:             for each parent v' ≠ v of w do
12:                 enqueue((false, v'), Q)
13:         delete v
14:     else
15:         set $W_Q$ to the set of all Q-nodes along paths from u to v or v to u
16:         set $W_{OR}$ to the set of all OR-nodes along paths from u to v or v to u
17:         let x be a new Q-node and define $S(x) = \bigcup_{t \in \{u, v\} \cup W_Q} S(t)$
18:         for each $s \in V_Q - W_Q$ such that $(s, w) \in E$, for some $w \in W_{OR}$ do
19:             enqueue((s, x))
20:         delete all nodes in $W_{OR}$
21:         for each $w \in W_Q \cup \{u, v\}$ do
22:             record(w, x), i.e. record that w is an alias for x
23:         contract the elements of $\{u, v\} \cup W_Q$ into the new node x
24: delete any nodes that are unreachable from the original root


Proof. First we observe that the algorithm terminates. At every iteration an
element is dequeued from $Q$. In trivial iterations, i.e. when $u = v$ in line 4,
nothing is enqueued in $Q$. In each non-trivial iteration, the number of nodes in
the graph decreases, thus the algorithm terminates.

Second we prove that $G'$ is acyclic. Note that line 23 is the only place in
the algorithm with the potential to introduce cycles, since the other lines that
modify the graph only delete edges or nodes. If the contraction of the elements
of $W_Q$ into a single node were to create a cycle, it would have even length (since
the contraction maintains the bipartite nature of the graph) and it would have
to contain a vertex outside of $W_Q \cup W_{OR}$, since we’ve deleted the edges from
$W_Q$ into $W_{OR}$. If such a cycle exists, then contracting $W_Q \cup W_{OR}$ into a single
node (and deleting edges from that node to itself) would produce a graph with
a cycle (derived from the other cycle), which Lemma 1 proves is impossible.

Next we prove that for each $v \in V_Q'$, all elements of $S'(v)$ are equivalent given
$\bigwedge (u_i \Leftrightarrow v_i)$. This requires showing that our graph modifications are valid given
the assertions in $Q$, and that any new assertions we add to $Q$ are valid. For an
assertion of the form “node $v$ is false”, our only action on the graph is to delete
every edge into $v$. Since all such edges are from OR-nodes, this is equivalent
to removing $v$ from a disjunction, which is valid given the assertion. Any child
$w$ of $v$ is an OR-node which, by definition, is equivalent to $v$ and therefore is
equivalent to false. This triggers two kinds of new assertions: first that any
child of $w$ is false, which is valid since a disjunction is false if and only if each
disjunct is false, and second that any other parent of $w$ is false, which is clear
since an OR-node is equivalent to its parent. These are exactly the assertions
added by the algorithm.

For an assertion of the form “nodes $u$ and $v$ are equivalent”, we do more. Point
5 of Theorem 2 shows that all nodes that are on paths from $u$ to $v$ or $v$ to $u$
are equivalent; these are $W_Q \cup W_{OR}$. Any parent of an element of $W_{OR}$ is
clearly equivalent to the elements of $W_Q$, so the new assertions added in line
19 are justified. The algorithm deletes all nodes in $W_{OR}$, then contracts $W_Q$
into a single new vertex $x$. Clearly contracting $W_Q$ into a single new node is
justified. Deleting $W_{OR}$ is justified by the observation that any $w \in W_{OR}$ has
a parent $a \in V_Q$ and a child $b \in V_Q$ and, by Theorem 2, $a \Leftrightarrow w \Leftrightarrow b$. Thus, we
can eliminate all children other than $b$ from $w$ without changing its meaning,
at which point $w$ provides no information other than that $a \Leftrightarrow b$, which the
contraction makes explicit, and that any other parent of $w$ is equivalent to $b$,
which the contraction and/or the assertions previously added make explicit.
This concludes the proof of Point 1.

Point 2 is clear from the fact that the algorithm only terminates when the queue
is empty, which means that all the input assertions have been processed. Point
3 essentially asserts that all the rewritings represented in $G$ are represented in
$G'$, except that some redundant disjuncts may have been deleted. This should
be clear, though we point out that new rewritings may be represented in $G'$.

□

Algorithm 3 uses Algorithm 2 to transform a rewrite graph into an equivalent
reduced rewrite graph. Essentially, all it has to do is detect violations of the
conditions from the definition of reduced rewrite graph, convert them into
assertions in a queue, and call Algorithm 2 to reorganize the graph based on the
assertions.


Algorithm 3 Rewrite graph to reduced rewrite graph. Input: $G = (V_Q, V_{OR}, E, S)$,
a rewrite graph with root node $r \in V_Q$. Output: $G$ is transformed into an
equivalent reduced rewrite graph. Note: Assume for all $v \in V_Q$, true $\notin S(v)$.

 1: while G is not reduced do
 2:     set Q to an empty queue
 3:     for all $v \in V_Q$ such that false $\in S(v)$ do
 4:         enqueue((false, v)) in Q
 5:     for all $u, v \in V_Q$ such that $S(u) \cap S(v) \neq \emptyset$ do
 6:         enqueue((u, v)) in Q
 7:     for all $w, x \in V_{OR}$ such that w and x have the same out-neighbor set do
 8:         for each in-neighbor u of x, add edge (u, w)
 9:         delete x
10:     for all $u, v \in V_Q$ that share out-neighbor $w \in V_{OR}$ do
11:         enqueue((u, v)) in Q
12:     for all $w \in V_{OR}$ with no out-neighbors do
13:         for each u with an edge to w enqueue((false, u)) on Q
14:         delete w
15:     for all $w \in V_{OR}$ with exactly one out-neighbor v do
16:         for each u with an edge to w enqueue((u, v)) on Q
17:         delete w
18:     Reorganize(G, Q)


4.2  Searching  with  rewrite  graphs 

With  the  machinery  of  the  reduced  rewrite  graph  in  place,  our  view  of  search 
changes  a  bit.  We  will  use  a  reduced  rewrite  graph  to  represent  the  formulas 
discovered  by  searching. 

The search starts with an input formula $f$, and a rewrite graph consisting of a
single Q-node $r$ such that $S(r) = \{f\}$. As before, each search iteration consists
of choosing a formula $g$ and generating all the subformulas produced by applying
rewrite operators to $g$. The difference is that the formula we choose is not, in
general, equivalent to the input $f$. Instead, it is an element of $S(u)$ for some
Q-node $u$ in the rewrite graph. If a new formula $h$ is generated that has no
disjunctions, it simply gets added to $S(u)$. If a new formula $h_1 \lor \cdots \lor h_k$ is
generated, a new OR-node child of $u$ is created, and for each $h_i$, a new Q-node
$v_i$, where $S(v_i) = \{h_i\}$, is created, all of which are children of the new OR-node.

Subformulas produced by rewrite operators may be normalized to false (or
true if we’re really lucky, since the entire search can be terminated and true
returned at that point), or may be normalized to a formula that has already
been generated. Either event triggers a reorganization of the rewrite graph along
the lines of Algorithm 3 in order to keep it reduced. This reorganization never
creates new formulas; it merely moves them to different nodes, or removes them
from the graph altogether.

Search terminates when there are no subformulas in nodes that have not already been rewritten using the rewrite operators. At this point, the rewrite graph implicitly represents all the rewritings generated by the given operators. From it, we can very easily pull out the rewriting that maximizes any grading function that distributes over disjunctions, e.g. a function $p$ satisfying $p(f \lor g) = p(f) + p(g)$.
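To make the extraction step concrete, here is a minimal sketch of how a best grade could be read off a rewrite graph by recursion over Q-nodes and OR-nodes. The class names `QNode` and `ORNode`, the string representation of conjunctions, and the counting rule in `count` are assumptions made for illustration, not a description of qfr's internals. The sketch also assumes every Q-node represents at least one rewriting, as it would in a reduced graph.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ORNode:
    children: List["QNode"]  # one Q-node per disjunct


@dataclass
class QNode:
    conjunctions: List[str]  # S(u): disjunction-free formulas
    or_children: List[ORNode] = field(default_factory=list)


def best(u: QNode, grade: Callable[[str], float]) -> float:
    """Highest grade among the rewritings represented at Q-node u."""
    options = [grade(c) for c in u.conjunctions]
    for o in u.or_children:
        # The grade distributes over disjunction: p(f v g) = p(f) + p(g).
        options.append(sum(best(v, grade) for v in o.children))
    return max(options)


def count(u: QNode) -> int:
    """Number of rewritings at u, under the assumed counting rule:
    each conjunction counts once, each OR-node contributes the product
    of the counts of its children."""
    total = len(u.conjunctions)
    for o in u.or_children:
        product = 1
        for v in o.children:
            product *= count(v)
        total += product
    return total


# Toy graph: the root holds one long conjunction and one OR-node
# whose two disjuncts can each be written more compactly.
left = QNode(["bbb", "bbbbb"])
right = QNode(["cccc"])
root = QNode(["aaaaaaaaaa"], [ORNode([left, right])])
shorter_is_better = lambda c: -len(c)
best_grade = best(root, shorter_is_better)
n_rewritings = count(root)
```

Because the graph is acyclic the recursion terminates; for graphs with heavily shared nodes, memoizing `best` and `count` per node would avoid repeated work.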

Search based on rewrite graphs is a vast improvement over the generic AI-inspired search for two reasons: 1) we avoid the inefficiencies outlined in Section 2, and 2) we are able to exploit information discovered during the search to throw out subformulas, often before they are ever rewritten.


5  Implementation:  qfr 


We have implemented the approach described above in a program called qfr (quantified formula rewriting). The program makes some assumptions that are not required by the framework described above, most notably that polynomials are always kept in fully factored form, and that sets associated with Q-nodes contain no disjunctions. These design decisions simplified the system, but limit it to some extent. Additionally, our framework requires three components:

1.  a  normalizer, 

2.  a  set  of  rewrite  operators,  and 

3.  a  mechanism  for  choosing  the  next  formula  to  process. 


Several good normalization and/or rewrite operations come from the discussion of formula simplification in [7], and the descriptions of the method of quantifier elimination by virtual term substitution given in [14].


5.1  The normalizer


The role of the normalizer is two-fold: making formulas that are essentially identical syntactically identical (and thus easy to identify as “the same”), and discovering unsatisfiable formulas. In fact, it's probably desirable to separate these two activities, though qfr doesn't. We identified four different “levels” of simplification. Levels 1 and 2 assume that atomic formulas are in the fully factored form $p_1 p_2 \cdots p_k \,\sigma\, 0$, where $\sigma$ is a relational operator, but never examine the factors themselves. Levels 3 and 4 try to deduce information about the factors themselves. Normalization operates on a conjunction.


1.  Level 1 normalization simplifies atomic formulas individually: removing content, eliminating exponents from $=$ and $\neq$, normalizing all exponents to 1 and 2 in $\geq$ and $\leq$, splitting $>$ and $<$ into even and odd factors (odds get exponent 1 and stay in the inequality, evens get exponent 1 as well, but broken into $\neq$ atoms), and breaking up atoms like $p_1 \cdots p_k \neq 0$ into $p_1 \neq 0 \land \cdots \land p_k \neq 0$.

2.  Level 2 normalization assumes Level 1 normalization of atoms, and simplifies the conjunction by merging atomic formulas with the same left-hand side and, when possible, using (in)equalities on a single factor to simplify multi-factor (in)equalities, e.g. simplifying $x + 1 > 0 \land (x + 1)(x^2 - ax + b) < 0$ to $x + 1 > 0 \land x^2 - ax + b < 0$.

3.  Level 3 simplification attempts to determine sign conditions of variables implied by the formula, and use those sign conditions to determine whether sign conditions on some factors are implied by the formula. For example, given $x - 1 > 0 \land x + y^2 + 1 < 0$, Level 3 normalization would deduce that $x - 1 > 0$ implies $x > 0$. It would then determine that with $x > 0$, $x + y^2 + 1$ must be positive, and thus that the formula is unsatisfiable.

4.  Level 4 normalization uses the implied sign conditions on variables along with sign conditions on one other factor to try to deduce sign conditions on a factor. For example, given $x + 1 > 0 \land 2x + y^2 + 2 < 0$, Level 4 normalization would deduce that $2x + y^2 + 2 - 2(x + 1) < 0$, which implies $y^2 < 0$, and thus the input is unsatisfiable.
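The Level 1 rules can be sketched on a toy representation of factored atoms. Everything about the representation is an assumption made for illustration: an atom is a list of (factor name, exponent) pairs plus a relation string, factors are opaque names, content removal is omitted, and the sketch assumes that the non-strict relations are the ones whose exponents are normalized to 1 and 2.

```python
from typing import List, Tuple

# Hypothetical representation: ([(factor, exponent), ...], relation)
Atom = Tuple[List[Tuple[str, int]], str]


def level1(atom: Atom) -> List[Atom]:
    """Return a list of atoms whose conjunction is equivalent to `atom`.
    An empty factor list stands for the constant 1."""
    factors, rel = atom
    if rel == "=":
        # p^e = 0 iff p = 0, so exponents can simply be dropped.
        return [([(p, 1) for p, _ in factors], "=")]
    if rel == "!=":
        # A product is nonzero iff every factor is nonzero.
        return [([(p, 1)], "!=") for p, _ in factors]
    if rel in ("<", ">"):
        # Odd powers keep the sign of p; even powers only require p != 0.
        odd = [(p, 1) for p, e in factors if e % 2 == 1]
        even = [([(p, 1)], "!=") for p, e in factors if e % 2 == 0]
        return [(odd, rel)] + even
    if rel in ("<=", ">="):
        # Even powers may vanish, so they are kept with exponent 2.
        return [([(p, 1 if e % 2 else 2) for p, e in factors], rel)]
    raise ValueError(f"unknown relation: {rel}")


strict = level1(([("x+1", 3), ("y", 2)], "<"))
nonzero = level1(([("p", 2), ("q", 1)], "!="))
weak = level1(([("p", 4), ("q", 3)], ">="))
```

Here `strict` encodes $(x+1)^3 y^2 < 0$ as $x + 1 < 0 \land y \neq 0$, and `weak` encodes $p^4 q^3 \geq 0$ as $p^2 q \geq 0$.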


The system allows some control over what normalization gets done, and whether it is only used for determining unsatisfiability, or whether it is also used for simplification. All levels are very fast relative to polynomial factorization, which is the biggest bottleneck in the system. Level 4 normalization depends quadratically on the number of factors in the conjunction, which makes it the most time consuming of the levels. On the other hand, discovering unsatisfiable subformulas, or strengthening an inequality into an equality, can have an immense global impact. So it is worth expending some time.

There are several other operations that are probably worth incorporating, including breaking sums of squares equalities into separate equalities, and using more sophisticated tests for unsatisfiability.


5.2  The  rewrite  operators 

Rewrite  operators  generate  the  space  that  we  search,  and  thus  determine  what 
we  can  find.  However,  too  many  operators,  or  operators  that  generate  too 
many  rewritings  can  be  a  problem,  as  they  will  swamp  the  system.  Ultimately, 
a  heuristic  guided  search  of  the  space  should  ameliorate  such  problems.  For 
purposes  of  this  paper,  however,  we  always  search  the  entire  space  of  rewritings. 
The  following  rewrite  operators  are  in  the  current  version  of  qfr. 


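1.  Splitting multi-factor equalities:

\[ \Bigl(\prod_{i=1}^{k} p_i = 0\Bigr) \land F \;\longrightarrow\; \bigvee_{i=1}^{k} (p_i = 0 \land F) \]

2.  Linear substitution:

\[ ax + b = 0 \land F \;\longrightarrow\; (a \neq 0 \land F|_{x \leftarrow -b/a}) \lor (a = 0 \land b = 0 \land F) \]

3.  Linear substitution for $x^k$: if $x$ only occurs in $F$ to powers that are multiples of $k$, and $t = (k + 1) \bmod 2$, then

\[ ax^k + b = 0 \land F \;\longrightarrow\; (a \neq 0 \land F|_{x^k \leftarrow -b/a} \land t\,ba \leq 0) \lor (a = 0 \land b = 0 \land F) \]

4.  Linear S-polynomial reduction:

\[ f = 0 \land g = 0 \land F \;\longrightarrow\; af - bg = 0 \land g = 0 \land F, \]

where $a, b \in \mathbb{Z} - \{0\}$, and $af - bg$ is linear in some quantified variable $x$, but neither $f$ nor $g$ is linear in $x$.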
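As a small worked instance of linear substitution for $x^k$, consider a hypothetical input (not taken from the paper) with $k = 2$, so that $t = (2 + 1) \bmod 2 = 1$. The side condition $t\,ba \leq 0$ is read here as the requirement that $-b/a$ be non-negative, so that a real $x$ with $x^2 = -b/a$ exists.

```latex
\begin{align*}
  & \exists x \,[\, a x^2 + b = 0 \;\land\; x^4 - c > 0 \,] \\
  \longrightarrow\;
  & \bigl( a \neq 0 \;\land\; (-b/a)^2 - c > 0 \;\land\; ba \leq 0 \bigr)
    \;\lor\;
    \exists x \,[\, a = 0 \;\land\; b = 0 \;\land\; x^4 - c > 0 \,] \\
  \equiv\;
  & \bigl( a \neq 0 \;\land\; b^2 - c\,a^2 > 0 \;\land\; ba \leq 0 \bigr)
    \;\lor\;
    \exists x \,[\, a = 0 \;\land\; b = 0 \;\land\; x^4 - c > 0 \,]
\end{align*}
```

The last step clears the denominator using $a^2 > 0$, which holds because $a \neq 0$. For odd $k$ we get $t = 0$, the side condition becomes $0 \leq 0$, and no sign restriction on $-b/a$ is needed.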


Linear substitution and factor splitting are the most obvious operators. Linear substitution for $x^k$ has been identified as crucial by the implementors of Redlog. Linear S-polynomial reduction is there by virtue of the fact that it is something that we've used by hand in the past. It would be interesting to look at a more general idea of this kind, i.e. Gröbner style reduction by equations to produce linear polynomials that can be substituted. However, the complexity increases when more polynomials can be involved, and multiplying by non-constant factors generates more case distinctions. So adding such an operator would have to be carefully thought out.


5.3  Controlling the search


The basic mechanism behind the search in qfr is a priority queue of subformulas. The program is essentially no more than a loop consisting of 1) dequeuing a formula, 2) applying all relevant operators to that formula, 3) adding the resulting formulas to the queue (if they are new) and to the rewrite graph, and 4) reorganizing the rewrite graph as needed.

The priority queue in the version of qfr described here simply follows a fewest-quantified-variables-first rule, with ties broken by the printed length of the formula. Once the entire space has been searched, we print out the “best” rewriting based on a simple grading scheme, although determining what is the “best” rewriting implicit in the graph is outside the scope of this paper.
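The loop and the queue ordering described above can be sketched as follows. This is a simplified illustration under stated assumptions: a formula is a hypothetical pair (set of quantified variables, printed form), operators are plain functions returning new formulas, and the rewrite graph and its reorganization (steps 3 and 4 of the loop) are left out entirely.

```python
import heapq
from itertools import count
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple

# Hypothetical formula representation: (quantified variables, printed form)
Formula = Tuple[FrozenSet[str], str]
Operator = Callable[[Formula], Iterable[Formula]]


def priority(f: Formula) -> Tuple[int, int]:
    """Fewest quantified variables first, ties broken by printed length."""
    qvars, text = f
    return (len(qvars), len(text))


def search(start: Formula, operators: List[Operator]) -> List[Formula]:
    """Process every reachable formula once; return the processing order."""
    tie = count()  # keeps heap entries comparable without comparing formulas
    seen: Set[Formula] = {start}
    queue = [(priority(start), next(tie), start)]
    order: List[Formula] = []
    while queue:
        _, _, f = heapq.heappop(queue)      # 1) dequeue a formula
        order.append(f)
        for op in operators:                # 2) apply all operators
            for g in op(f):
                if g not in seen:           # 3) enqueue only new formulas
                    seen.add(g)
                    heapq.heappush(queue, (priority(g), next(tie), g))
    return order


def toy_operator(f: Formula) -> List[Formula]:
    # Made-up rewritings of a made-up start formula.
    if f[1] == "start":
        return [
            (frozenset({"x", "y"}), "s2"),
            (frozenset({"y"}), "a-long-formula"),
            (frozenset({"y"}), "short"),
        ]
    return []


visited = [text for _, text in search((frozenset({"x", "y"}), "start"), [toy_operator])]
```

In the toy run, both one-variable formulas are processed before the two-variable formula `s2`, even though `s2` prints shorter, because the variable count takes precedence over the printed length.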


6  Performance  on  example  problems 

The  fundamental  question  for  this  report  is  whether  or  not  the  system  can 
search  the  entire  space  of  formula  rewritings  for  interesting  sized  input  formulas, 
and  how  large  that  space  is.  We  will  focus  on  a  few  example  problems,  and 
demonstrate  that  it  can  in  many  instances.  Moreover,  we’ll  see  instances  in 
which  it  does  better  than  QEPCADB,  Redlog  or  Mathematica  individually.  All 
timings  given  are  on  a  1.6  GHz  Pentium  with  512MB  of  memory. 


6.1  The  SEIT  Problem 


First we consider the example from the introduction: the system of equations arising from finding the equilibrium points of the SEIT model. As a disclaimer, it should be pointed out that the inability of QE software to handle this problem motivated the present work. So the fact that qfr does a good job with it shouldn't be too surprising. Hopefully it is an instructive illustration nonetheless. Recall that this quantifier elimination problem is:

\[
\exists S, E, I, T \left[
\begin{array}{l}
d - dS - \beta_1 I S = 0 \;\land\; vE - (d + r_2) I = 0 \;\land \\
\beta_1 I S + \beta_2 I T - (d + v + r_1) E + (1 - q) r_2 I = 0 \;\land \\
-dT + r_1 E + q r_2 I - \beta_2 T I = 0 \;\land\; E > 0 \land I > 0 \land T > 0 \land S > 0
\end{array}
\right]
\]

where $d > 0 \land v > 0 \land r_1 > 0 \land r_2 > 0 \land q > 0 \land \beta_1 > \beta_2 > 0$.


Figure 2: Rewrite graph resulting from searching the full space of rewritings for the SEIT problem.

qfr is able to search the entire space of rewritings of this problem with the given operators in 87 seconds. The rewrite graph grows and contracts in the process, as Q-nodes are discovered to be false or distinct Q-nodes are discovered to be equivalent. In the end, the graph (see Figure 2) contains just 10 Q-nodes and six OR-nodes, with 759 conjunctions distributed across the Q-nodes. It represents 118,535 distinct rewritings, and that is, of course, after having thrown away many disjuncts that it discovered to be unsatisfiable or redundant. The “best” formulation found is

\[
\exists s \left[
\begin{array}{l}
s > 0 \;\land\; s - 1 < 0 \;\land \\
\beta_1 s q r_2 v + \beta_1 s^2 \beta_2 v - d s \beta_2 v - \beta_1 s \beta_2 v + d \beta_2 v - \beta_1^2 s^2 v \\
\quad + \beta_1 d s v + \beta_1 s r_1 r_2 - d s \beta_2 r_2 + d \beta_2 r_2 + \beta_1 d s r_2 + \beta_1 d s r_1 - d^2 s \beta_2 \\
\quad + d^2 \beta_2 + \beta_1 d^2 s = 0
\end{array}
\right]
\]
\[
\land\; \beta_1 > 0 \land \beta_2 > 0 \land d > 0 \land v > 0 \land r_1 > 0 \land r_2 > 0 \land q > 0 \land \beta_2 - \beta_1 < 0
\]


and, in fact, qfr finds that rewriting less than a second into its search. QEPCADB is able to solve this formulation quite easily and with the simplest possible answer, provided the conditions on the free variables are passed along as “assumptions”. Redlog and Mathematica solve the above QE problem instantly as well, but with larger formulas.
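The one-variable formulation can be checked against the original SEIT system with exact rational arithmetic. The sketch below is an independent consistency check, not a trace of what qfr does: the order of substitutions (solve equation 1 for $I$, equation 2 for $E$, and the sum of equations 3 and 4 for $T$) and all function names are assumptions chosen for illustration, with `s` playing the role of $S$ and `b1`, `b2` standing for $\beta_1$, $\beta_2$.

```python
from fractions import Fraction as Fr


def seit_residuals(S, E, I, T, d, v, r1, r2, q, b1, b2):
    """Left-hand sides of the four SEIT equilibrium equations."""
    return (
        d - d * S - b1 * I * S,
        v * E - (d + r2) * I,
        b1 * I * S + b2 * I * T - (d + v + r1) * E + (1 - q) * r2 * I,
        -d * T + r1 * E + q * r2 * I - b2 * T * I,
    )


def back_substitute(s, d, v, r1, r2, q, b1, b2):
    """Express I, E, T in terms of s by three linear substitutions."""
    I = d * (1 - s) / (b1 * s)                    # equation 1, linear in I
    E = (d + r2) * I / v                          # equation 2, linear in E
    T = (b1 * I * s - (d + v) * E + r2 * I) / d   # equation 3 + equation 4
    return E, I, T


def reduced_poly(s, d, v, r1, r2, q, b1, b2):
    """The polynomial in the one-variable formulation."""
    return (b1*s*q*r2*v + b1*s**2*b2*v - d*s*b2*v - b1*s*b2*v + d*b2*v
            - b1**2*s**2*v + b1*d*s*v + b1*s*r1*r2 - d*s*b2*r2 + d*b2*r2
            + b1*d*s*r2 + b1*d*s*r1 - d**2*s*b2 + d**2*b2 + b1*d**2*s)


# Arbitrary rational sample point; it need not be an equilibrium.
params = dict(d=Fr(2), v=Fr(3), r1=Fr(5), r2=Fr(7), q=Fr(1, 2), b1=Fr(11), b2=Fr(4))
s = Fr(1, 3)
E, I, T = back_substitute(s, **params)
res = seit_residuals(s, E, I, T, **params)
lhs = res[3] * params["b1"] ** 2 * s ** 2 * params["v"]
rhs = params["d"] * (1 - s) * reduced_poly(s, **params)
```

After the substitutions, equations 1 and 2 vanish identically, equations 3 and 4 cancel each other, and equation 4 equals the polynomial times $d(1 - s) / (\beta_1^2 s^2 v)$. Since that factor is positive for $0 < s < 1$ and positive parameters, the remaining equation holds exactly when the polynomial vanishes.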


6.2  The Joswig-Witte Problem


In [10], the truth of a certain conjecture is shown to depend upon the satisfiability of the following system of equations and constraints:

f  1  +  s2x  1X3  +  s8X2X3  +  s19£i£2£4  =  0  j 


3s,£l,a;2,£3,X4 


A 

X\  +  SSX 1X2X3  +  s19x  2X4  =  0 
A 

X2  +  suxiX4  +  s10X3a:4  =  0 

A 

S4X 1X2  +  X3  +  S19£ 1X3X4  +  S24X2X3X4,  =  0 


A 

£4  +  S44  X1X2X3X4  =  0 


A 

0  <  s  <  1 


The high degrees of $s$ in this system make it difficult to solve. Even computing a Gröbner basis requires many hours and the right program. qfr is able to search the entire space of rewritings for this problem in less than 5 1/2 minutes. The “best” rewriting it discovers is

\[
\exists s, x_1 \left[
\begin{array}{l}
s > 0 \land s - 1 < 0 \land x_1 \neq 0 \land x_1 + 1 \neq 0 \land x_1 - 1 \neq 0 \\
\land\; s^{23} x_1 - 1 \neq 0 \land P = 0 \land Q = 0
\end{array}
\right],
\]

where $P$ and $Q$ are relatively large polynomials in $s$ and $x_1$. This two variable system is found to be unsatisfiable in 135 seconds by cad2d, a special version of QEPCADB that is optimized for 2D CAD construction. For technical reasons², we must add the constraint $\mathrm{res}_{x_1}(P, Q) = 0$ to the system, otherwise cad2d doesn't restrict its lifting to the points at which $\mathrm{res}_{x_1}(P, Q)$ vanishes, as it obviously should. Essentially, the 2-variable system is solved by: 1) isolating roots of the resultants, 2) throwing away those that fall outside the $s$-interval $(0, 1)$, 3) for each remaining root $\alpha$ constructing interval polynomials containing $P(\alpha, x_1)$ and $Q(\alpha, x_1)$, 4) isolating the roots of these interval polynomials, and 5) verifying that the isolating intervals for the two polynomials are disjoint. The interval Descartes root isolation method described in [6], for example, is able to do this quite quickly.


Figure 3 shows the final rewrite graph for this problem. It consists of 20 OR-nodes and 41 Q-nodes containing 1841 conjunctions. In this case the graph is a tree; it represents 12,191 distinct rewritings of the input. The tree is less than half the size if we don't use the “linear S-polynomial” rewriting operator, and the “best” rewriting is unchanged. This gives some indication that ramping up the number of operators available is likely to make a heuristic approach to limiting the search more necessary. It's also worth pointing out that much of the work of the search went into exploring cases that were eventually found to be unsatisfiable, and those portions of the graph were then removed.

² Equational constraints are not implemented in cad2d, which is why the propagated constraint $\mathrm{res}_{x_1}(P, Q)$ must be added to the formula explicitly by hand.

Figure 3: Rewrite graph resulting from searching the full space of rewritings for the Joswig-Witte problem.


6.3  The  Wang  Xia  Problem 


In [13], Wang & Xia analyze a system of five equations in five variables, with inequality side constraints, and a single parameter, $v$. The system describes the equilibrium points of a biological model. The system was “solved” in the following sense: the positive $v$ axis was decomposed into intervals, inside of which the number of solutions to the system was constant as $v$ varied. The approach was based on computing “border polynomials” [16]. No timing information was given.

The system they considered is the following:

\[
\begin{array}{l}
150000 v z_3 + 750 v z_3 x - 599999 x + 200 = 0 \\
\land \\
625 y_1^2 + 750000 y_3 + 625 y_3 y_1 + 19200 x y_1 - 8 x y_1^2 - 8 x y_1 y_3 - 900000000 = 0 \\
\land \\
-11520000 x + 9600 x y_1 + 8 x y_1 y_3 + 8 x y_3^2 + 1500000 y_3 - 625 y_3 y_1 - 625 y_3^2 = 0 \\
\land \\
250 z_1^2 + 75000 z_3 + 250 z_3 z_1 + 1800 y_3 z_1 - 3 y_3 z_1^2 - 3 y_3 z_1 z_3 - 22500000 = 0 \\
\land \\
-270000 y_3 + 900 y_3 z_1 + 3 y_3 z_1 z_3 + 3 y_3 z_3^2 + 150000 z_3 - 250 z_3 z_1 - 250 z_3^2 = 0 \\
\land \\
x > 0 \land y_1 > 0 \land y_3 > 0 \land z_1 > 0 \land z_3 > 0 \land 1200 - y_1 - y_3 > 0 \land 300 - z_1 - z_3 > 0
\end{array}
\]

They were interested in the solutions to this system in the parameter $v$. By keeping track of the operators applied during search, qfr is actually able to produce solutions to the original problem from solutions to the simplified problem, even when its rewriting steps eliminate quantifiers (at least for the set of rewrite operators considered here). Quantifying all variables but $v$ in the above system, qfr searches the whole space in 64 seconds, producing as its “best” formula

2/3  >  0  A  z3  >  0 
A 

(750 vz3  -  599999)  (66750^232/3  -  14999911j/3  +  57600000^3  +  76800)  <  0 

A 

27 4yl  +  81002|y|  -  2430000232/1  -  729000000y| 

-22502|y|  +  6750002|2/|  +  405000000232/1 
-1875002|2/3  +  56  2  5  0  0  0  0232/3  +  33  7  5  0  0  0  0  0  0  0232/3 
+1562500023  -  140625000002|  =  0 

A 

32I2/3  -  2  7  0  0  002/3  -  2  50 z\  +  15000023  <  0 
A 

3232/3  +  9002/3  -  25023  >  0 
A 

(66750u232/f  -  149999112/f  -  45000000u232/3 
+359999400002/3  -  69120000000t>23  -  92160000) 

(66750u232/3  -  149999112/3  +  57600000u23  +  76800)  <  0 

A 

43441734375000c32|2/|  +  2753611266937500u2zf2/f 
— 7818742657268310750u232/f  +  11249895750048951029732/f 
+41616956250000000c32|2/| 

-15960208532175000000c22|2/I 

+19507457439221957100000U232/I 

-40499739900283739013524002/1 

-78356160000000000000u32|2/3 

+7257286575360000000000u22|2/3+ 

10368019353182100480000000z)  232/3 

+138240129022142668800002/3  -  63700992000000000000000u32| 
-254803968000000000000u22| 

— 339738624000000000v23  -  150994944000000  =  0 


This result was obtained by making the following substitutions:

\[
x = \frac{625 y_3 (-2400 + y_3 + y_1)}{8 (y_3 + 1200)(y_3 + y_1 - 1200)}
\]
\[
z_1 = -\frac{-270000 y_3 - 250 z_3^2 + 150000 z_3 + 3 z_3^2 y_3}{3 z_3 y_3 + 900 y_3 - 250 z_3}
\]
\[
y_1 = -\frac{66750 v z_3 y_3^2 - 14999911 y_3^2 - 45000000 v z_3 y_3 + 35999940000 y_3 - 69120000000 v z_3 - 92160000}{66750 v z_3 y_3 - 14999911 y_3 + 57600000 v z_3 + 76800}
\]

It is important to note that during this search process qfr deduced that the original system implies that the denominators in each of these substitution expressions are non-zero.
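These substitutions can be sanity-checked with exact rational arithmetic: plugging them into the first, third and fifth equations of the Wang-Xia system should make all three vanish identically. The sketch below does this at one arbitrary sample point. The function names are illustrative assumptions, and the sample point is not required to satisfy the inequality constraints, since only the polynomial identities are being tested.

```python
from fractions import Fraction as Fr


def substitutions(v, y3, z3):
    """x, y1 and z1 expressed in terms of v, y3 and z3."""
    w = v * z3
    num = (66750*w*y3**2 - 14999911*y3**2 - 45000000*w*y3
           + 35999940000*y3 - 69120000000*w - 92160000)
    den = 66750*w*y3 - 14999911*y3 + 57600000*w + 76800
    y1 = -num / den
    x = 625*y3*(y3 + y1 - 2400) / (8*(y3 + 1200)*(y3 + y1 - 1200))
    z1 = (-(3*z3**2*y3 - 270000*y3 - 250*z3**2 + 150000*z3)
          / (3*z3*y3 + 900*y3 - 250*z3))
    return x, y1, z1


def residuals(v, x, y1, y3, z1, z3):
    """Left-hand sides of equations 1, 3 and 5 of the system."""
    e1 = 150000*v*z3 + 750*v*z3*x - 599999*x + 200
    e3 = (-11520000*x + 9600*x*y1 + 8*x*y1*y3 + 8*x*y3**2
          + 1500000*y3 - 625*y3*y1 - 625*y3**2)
    e5 = (-270000*y3 + 900*y3*z1 + 3*y3*z1*z3 + 3*y3*z3**2
          + 150000*z3 - 250*z3*z1 - 250*z3**2)
    return e1, e3, e5


# Arbitrary rational sample point with non-zero denominators.
v, y3, z3 = Fr(1, 1000), Fr(100), Fr(50)
x, y1, z1 = substitutions(v, y3, z3)
check = residuals(v, x, y1, y3, z1, z3)
```

Read this way, $z_1$ comes from the fifth equation, $x$ from the third, and $y_1$ from equating that $x$ with the value of $x$ forced by the first equation. The two remaining equations are what survive, after substitution, as the large polynomial equalities in the “best” formula.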


QEPCADB can be used to “solve” the system in the same sense as Wang-Xia's solution: namely that the $v$-axis is decomposed into open intervals in which the number of real solutions is constant (in fact, on each interval the solutions are defined by a finite set of real-valued functions of the interval), and the system is solved at a sample value for $v$ from each interval. This requires less than 90 seconds.

To understand the value of the rewriting, we first note that QEPCADB is unable to solve the original formulation of the problem, even when given the measure-zero-error option, which corresponds to Wang-Xia's ignoring finitely many values of $v$. Redlog 3.1 gives an “Arithmetic exception” error and fails using rlgqe, the “generic” quantifier elimination option. Mathematica fails to give an answer after more than 4 hours, although, in fairness, we are unaware of any way to indicate to it that finitely many $v$ values may be ignored. QEPCADB, as mentioned, can solve the rewritten system, and Mathematica doubtless could as well if there were a way to tell it to solve the problem “generically”, i.e. not to lift over section cells in $v$-space.

To demonstrate that finding this rewriting is not trivial, we consider asking Redlog and Mathematica to eliminate $\{y_1, z_1, x\}$ from the original input. Both require about one second to compute a result. Redlog eliminates $x$ and $z_1$, but not $y_1$, and returns a formula consisting of 50729 characters (not counting whitespace). Mathematica eliminates all three variables, returning a formula consisting of 26683 characters. The formula returned by qfr consists of 875 characters.


6.4  Computing  possible  branch  cuts 


This example stems from the approach developed in [2, 1] for simplification of expressions containing inverse elementary functions. The question is to determine potential branch cuts for the function


where $p, q \in \mathbb{C}$. “Potential branch cuts” consist of the union of the cuts for $\sqrt{p}$ and $\sqrt{q}$ along with the points in $pq$-space that get mapped by $\sqrt{p} - \sqrt{q}$ to the negative real axis, or zero. The latter set is characterized by the following formula, which represents a complex number as the ordered pair given by its real and imaginary parts. Note: variable $u$ represents $\sqrt{p}$, and $v$ represents $\sqrt{q}$.

\[
\exists R_u, I_u, R_v, I_v \left[
\begin{array}{l}
R_p = R_u^2 - I_u^2 \land I_p = 2 R_u I_u \land [R_u > 0 \lor R_u = 0 \land I_u \geq 0] \\
\land \\
R_q = R_v^2 - I_v^2 \land I_q = 2 R_v I_v \land [R_v > 0 \lor R_v = 0 \land I_v \geq 0] \\
\land \\
I_u - I_v = 0 \land R_u - R_v \leq 0
\end{array}
\right]
\]


After 3 seconds, qfr completes the search of the entire space of rewritings. The rewrite graph consists of two OR-nodes and six Q-nodes containing 185 conjunctions. The “best” rewriting is given by the disjunction of four quantified formulas:


3 Ru  .  R 
V 
3  L 


Ru  >  0  A  Rv  >  0  A  Rv  -  Ru  >=  0  A  Ip  =  0  A  Iq  =  OA 
R%  -  Rq  =  0  A  R\  -  Rp  =  0 


(IV)(IP)  >  0  A  (J„)(/,)  >  0  A  {Iv){Iq  -  Ip)  >=  OA 
I2n  -  4/,2  R.„  -  4 It  =  0  A  Alt  +  ARvtt  -  I2  =  0 


V 


3 IV[IV  >  0  A  Ip  =  0  A  Iq  >  0  A  72  -  AIlRq  -  Alt  =  0  A  /2  +  Rp  =  0] 


V 

3  /„ 


Iv  >0A/p  =  0A/,  =  0A/-  -  Al;,Rq  -  Alt  =  OA 
Alt  +  ARpI2  -  I2  =  0  A  Ia  -  Iv  =  0 


At first blush, this might not seem like much of an improvement, but QEPCADB solves each piece of this extremely quickly, providing a simple solution; Mathematica quickly gives an answer that's about half the size of what it produces from the original input; and Redlog quickly produces a formula consisting of two simple quantifier-free pieces and two simple pieces with one quantified variable, as opposed to the formula consisting of six pieces, each with one quantified variable.


6.5  The  REMIS-Patterson  problem:  poor  performance 

When qfr's rewrite operators simply don't apply, for example to formulas without equalities, the whole approach described here has nothing to offer. On the other hand, it also takes no time to run it and discover that fact. Poor performance is when the program runs for a long time without discovering interesting reformulations of the problem. In this paper, we are not so much aiming at finding good formulations quickly, as at exploring the entire space of reformulations. It is fully expected that the incorporation of some heuristics to rule out unpromising regions of the search space would drastically reduce the search time, without significantly degrading the quality of rewritings discovered. However, without such heuristics, we must regard a problem for which the time to search the whole space is “unreasonably large” as providing an example on which the algorithm performs poorly.

The REMIS database³ includes many real quantifier elimination problems. One of these is the “Patterson Problem” [8]. The existential formulation of this problem is:

³ http://www.algebra.fim.uni-passau.de/~redlog/remis/


3x1,x2,x3,x4 


{y  —  u3)x2  +  (— x  +  u2)xi  —  u2y  +  u3x  =  0  A  2uix2  —u\  =  0 
A 

yx 4  +  (— x  +  u\)x3  —  u±y  =  0  A  2u2X4  +  2u3x3  —  u3  —  =  0 

A 

(' U\U3X2  +  UiU2Xi)X4  +  (—UiU2X2  +  U\U3Xi)x3 
+  (— UlU3  —  Ullfyxi  ^  0 


This formula is in some sense set up for qfr to perform poorly. It models a geometric configuration, but leaves out all the non-degeneracy conditions. This was purposely done to demonstrate how Redlog's “generic” quantifier elimination discovers non-degeneracy conditions for itself. If we add those conditions
(uiu2  —  u2x  —  u3y  ^OAu2-i/0Aj/^0),  qfr  performs  acceptably.  It 
searches  the  entire  space  of  rewritings  in  27  seconds,  producing  a  search  tree 
with  23  OR-nodes  and  1176  conjunctions  distributed  across  27  Q-nodes. 

However,  without  the  non-degeneracy  conditions,  qfr  takes  about  17  minutes 
to  complete  its  search  —  spending  all  that  extra  time  exploring  portions  of 
the  space  of  rewritings  that  correspond  to  cases  that  are  simply  not  of  interest. 
Moreover,  because  the  only  operator  qfr  has  for  eliminating  quantified  variables 
applies  only  to  equations,  it  returns  a  large  formula  in  which  many  disjuncts 
have  quantified  variables  that  can  be  eliminated  trivially  —  for  example: 

\[ \exists x_3 [\, x - u_2 \neq 0 \land u_1 - x = 0 \land y = 0 \land u_3 = 0 \land u_2 \neq 0 \land x \neq 0 \land x_3 \neq 0 \,] \]

For humans, “$\exists x_3 [x_3 \neq 0]$” is pretty easily recognized as true! Since the basic approach of this work has been to ensure that the program can at least do the things that people can easily do, we should add a rewrite operator that does quantifier elimination for subformulas of the form $\exists x [ax + b \,\sigma\, 0]$ even when $\sigma$ is not “=”.


7  Conclusions  and  Future  Work 


This paper considers the problem of finding rewritings of quantified input formulas that make good inputs to quantifier elimination programs. The essential motivation for this is the observation that quantifier elimination algorithms do not do a good job of exploiting problem structure, and that by rephrasing QE problems, experts are often able to make more effective use of QE software. We have presented a data structure, the “rewrite graph”, and an algorithm based on that data structure that explores the space of rewritings of the input formula induced by a set of rewrite operators. We have shown that even for some non-trivial problems, the method is able to explore the entire space of rewritings and discover good rewritings.

There are many directions for future work. First and foremost is to investigate heuristic methods for guiding the search for rewritings. Ultimately we would like to find good answers without having to search the entire space of rewritings. A heuristically guided approach is likely to become increasingly necessary as the number of rewrite operators available to the system grows.

Another  direction  for  work  concerns  the  “next  step”  —  i.e.  the  passing  on  to 
one  or  more  QE  systems  of  pieces  of  the  reformulated  problem.  It  may  be  better 
not  to  keep  the  search  for  rewritings  and  the  use  of  QE  algorithms  separate. 
Instead,  as  qfr  recognizes  that  a  node  contains  a  quantified  formula  that  would 
be  particularly  easy  for  an  available  system,  it  could  simply  call  the  system 
directly on that node. The result of the QE, particularly if that node is found to be unsatisfiable, could have a big impact on the subsequent search.


8  Appendix


Lemma 1 Let $G = (V, E)$ be a directed acyclic graph. Let $u$ and $v$ be vertices such that there is no path from $v$ to $u$. If $V'$ is the set of all vertices along paths from $u$ to $v$, then the graph $G'$ obtained from $G$ by contracting $V' \cup \{u, v\}$ into a single new vertex $x$ (cutting out edges from $x$ back into $x$) is a DAG.


Proof. Suppose $G'$ has a cycle $w_1, \ldots, w_k$. The cycle must contain vertex $x$ since, otherwise, the same vertices would form a cycle in $G$. Let $w_i = x$. Thus, there exist vertices $a, b \in V' \cup \{u, v\}$ such that $w_1, \ldots, w_{i-1}, a$ and $b, w_{i+1}, \ldots, w_k, w_1$ are paths in $G$. Let $P_b$ be the path from $u$ to $b$ in $G$, and let $P_a$ be the path from $a$ to $v$ in $G$. Then $P_b \to w_{i+1}, \ldots, w_k, w_1, w_2, \ldots, w_{i-1} \to P_a$ is a path from $u$ to $v$ in $G$. Thus every $w_j$ lies on a path from $u$ to $v$, so $w_1 = w_2 = \cdots = w_k = x$, i.e. the cycle is an edge from $x$ back into $x$, which is a contradiction. $\square$